acceleration signal
3DPCNet: Pose Canonicalization for Robust Viewpoint-Invariant 3D Kinematic Analysis from Monocular RGB cameras
Ekanayake, Tharindu, Casado, Constantino Álvarez, López, Miguel Bordallo
Monocular 3D pose estimators produce camera-centered skeletons, creating view-dependent kinematic signals that complicate comparative analysis in applications such as health and sports science. We present 3DPCNet, a compact, estimator-agnostic module that operates directly on 3D joint coordinates to rectify any input pose into a consistent, body-centered canonical frame. Its hybrid encoder fuses local skeletal features from a graph convolutional network with global context from a transformer via a gated cross-attention mechanism. From this representation, the model predicts a continuous 6D rotation that is mapped to an $SO(3)$ matrix to align the pose. We train the model in a self-supervised manner on the MM-Fi dataset using synthetically rotated poses, guided by a composite loss ensuring both accurate rotation and pose reconstruction. On the MM-Fi benchmark, 3DPCNet reduces the mean rotation error from over 20$^{\circ}$ to 3.4$^{\circ}$ and the Mean Per Joint Position Error from ~64 mm to 47 mm compared to a geometric baseline. Qualitative evaluations on the TotalCapture dataset further demonstrate that our method produces acceleration signals from video that show strong visual correspondence to ground-truth IMU sensor data, confirming that our module removes viewpoint variability to enable physically plausible motion analysis.
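The abstract above mentions a continuous 6D rotation that is mapped to an $SO(3)$ matrix. A minimal sketch of one common way to do this, Gram-Schmidt orthonormalization of two 3D vectors, is given below. The function names, the column convention, and the `canonicalize` helper are assumptions for illustration; the abstract does not specify the exact construction used by 3DPCNet.

```python
import numpy as np

def rotation_6d_to_matrix(r6):
    """Map a 6D vector to a 3x3 rotation matrix via Gram-Schmidt (assumed convention)."""
    r6 = np.asarray(r6, dtype=float)
    a1, a2 = r6[:3], r6[3:]
    b1 = a1 / np.linalg.norm(a1)            # first basis vector
    a2_orth = a2 - np.dot(b1, a2) * b1      # remove the component along b1
    b2 = a2_orth / np.linalg.norm(a2_orth)  # second basis vector
    b3 = np.cross(b1, b2)                   # third vector completes a right-handed frame
    return np.stack([b1, b2, b3], axis=1)   # basis vectors stored as columns

def canonicalize(pose, r6):
    """Rotate a (J, 3) array of joint coordinates by the predicted rotation (hypothetical helper)."""
    R = rotation_6d_to_matrix(r6)
    return pose @ R.T
```

Because the output is orthonormal with determinant +1 for any non-degenerate input, a network can regress the six numbers freely without a separate projection step.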
Classification of 24-hour movement behaviors from wrist-worn accelerometer data: from handcrafted features to deep learning techniques
Sameh, Alireza, Rostami, Mehrdad, Oussalah, Mourad, Farrahi, Vahid
Purpose: We compared the performance of deep learning (DL) and classical machine learning (ML) algorithms for the classification of 24-hour movement behavior into sleep, sedentary, light intensity physical activity (LPA), and moderate-to-vigorous intensity physical activity (MVPA). Methods: Open-access data from 151 adults wearing a wrist-worn accelerometer (Axivity-AX3) were used. Participants were randomly divided into training, validation, and test sets (121, 15, and 15 participants, respectively). Raw acceleration signals were segmented into non-overlapping 10-second windows, and then a total of 104 handcrafted features were extracted. Four DL algorithms, namely Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), Gated Recurrent Units (GRU), and One-Dimensional Convolutional Neural Network (1D-CNN), were trained using raw acceleration signals and with handcrafted features extracted from these signals to predict 24-hour movement behavior categories. The handcrafted features were also used to train classical ML algorithms, namely Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), Logistic Regression (LR), Artificial Neural Network (ANN), and Decision Tree (DT), for classifying 24-hour movement behavior intensities. Results: LSTM, BiLSTM, and GRU showed an overall accuracy of approximately 85% when trained with raw acceleration signals, and 1D-CNN an overall accuracy of approximately 80%. When trained on handcrafted features, the overall accuracy for both DL and classical ML algorithms ranged from 70% to 81%. Overall, there was higher confusion in the classification of MVPA and LPA than in the sleep and sedentary categories. Conclusion: DL methods with raw acceleration signals had only slightly better performance in predicting 24-hour movement behavior intensities, compared to when DL and classical ML were trained with handcrafted features.
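The segmentation step described above (non-overlapping 10-second windows followed by handcrafted features) can be sketched as follows. The sampling rate passed in, the function names, and the eight features computed here are assumptions for illustration; they are not the 104 features used in the study.

```python
import numpy as np

def segment_windows(acc, fs, window_s=10.0):
    """Split an (N, 3) acceleration array into non-overlapping windows of window_s seconds."""
    acc = np.asarray(acc, dtype=float)
    n = int(round(fs * window_s))        # samples per window
    n_windows = acc.shape[0] // n        # trailing samples that do not fill a window are dropped
    return acc[: n_windows * n].reshape(n_windows, n, acc.shape[1])

def basic_features(windows):
    """Compute a few illustrative per-window features (hypothetical feature set)."""
    mean = windows.mean(axis=1)                # per-axis mean
    std = windows.std(axis=1)                  # per-axis standard deviation
    mag = np.linalg.norm(windows, axis=2)      # vector magnitude per sample
    return np.hstack([
        mean,
        std,
        mag.mean(axis=1, keepdims=True),
        mag.std(axis=1, keepdims=True),
    ])
```

The windowed array can be fed directly to a sequence model, while the feature matrix suits classical classifiers.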
Automated Brake Onset Detection in Naturalistic Driving Data
Liu, Shu-Yuan, Engström, Johan, Markkula, Gustav
Response timing measures play a crucial role in the assessment of automated driving systems (ADS) in collision avoidance scenarios, including but not limited to establishing human benchmarks and comparing ADS to human driver response performance. For example, measuring the response time (of a human driver or ADS) to a conflict requires the determination of a stimulus onset and a response onset. In existing studies, response onset relies on manual annotation or vehicle control signals such as accelerator and brake pedal movements. These methods are not applicable when analyzing large scale data where vehicle control signals are not available. This holds in particular for the rapidly expanding sets of ADS log data where the behavior of surrounding road users is observed via onboard sensors. To advance evaluation techniques for ADS and enable measuring response timing when vehicle control signals are not available, we developed a simple and efficient algorithm, based on a piecewise linear acceleration model, to automatically estimate brake onset that can be applied to any type of driving data that includes vehicle longitudinal time series data. We also proposed a manual annotation method to identify brake onset and used it as ground truth for validation. R^2 was used as a confidence metric to measure the accuracy of the algorithm, and its classification performance was analyzed using naturalistic collision avoidance data of both ADS and humans, where our method was validated against human manual annotation. Although our algorithm is subject to certain limitations, it is efficient, generalizable, applicable to any road user and scenario types, and is highly configurable.
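To illustrate the idea of a piecewise linear acceleration model with R^2 as a confidence metric, the sketch below fits a two-segment model (constant acceleration, then a linear ramp) by grid search over candidate breakpoints. The two-segment parameterization, the grid search, and the name `fit_brake_onset` are assumptions; the abstract does not give the model's exact form.

```python
import numpy as np

def fit_brake_onset(t, a):
    """Fit a(t) = a0 + slope * max(t - t0, 0) and return (t0, a0, slope, r2)."""
    t = np.asarray(t, dtype=float)
    a = np.asarray(a, dtype=float)
    ss_tot = float(np.sum((a - a.mean()) ** 2))
    best = None
    for i in range(1, len(t) - 1):           # candidate onsets at interior samples
        t0 = t[i]
        ramp = np.clip(t - t0, 0.0, None)    # zero before onset, linear after
        X = np.column_stack([np.ones_like(t), ramp])
        coef, *_ = np.linalg.lstsq(X, a, rcond=None)
        resid = a - X @ coef
        sse = float(resid @ resid)
        if best is None or sse < best[0]:
            best = (sse, t0, coef)
    sse, t0, coef = best
    r2 = 1.0 - sse / ss_tot if ss_tot > 0 else 0.0   # goodness of fit as confidence
    return float(t0), float(coef[0]), float(coef[1]), r2
```

A low R^2 would flag traces where the assumed shape does not describe the data, so the estimated onset should be treated with caution.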
Tactile Perception in Upper Limb Prostheses: Mechanical Characterization, Human Experiments, and Computational Findings
Ivani, Alessia Silvia, Catalano, Manuel G., Grioli, Giorgio, Bianchi, Matteo, Visell, Yon, Bicchi, Antonio
Our research investigates vibrotactile perception in four prosthetic hands with distinct kinematics and mechanical characteristics. We found that rigid and simple socket-based prosthetic devices can transmit tactile information and surprisingly enable users to identify the stimulated finger with high reliability. This ability decreases with more advanced prosthetic hands with additional articulations and softer mechanics. We conducted experiments to understand the underlying mechanisms. We assessed a prosthetic user's ability to discriminate finger contacts based on vibrations transmitted through the four prosthetic hands. We also performed numerical and mechanical vibration tests on the prostheses and used a machine learning classifier to identify the contacted finger. Our results show that simpler and rigid prosthetic hands facilitate contact discrimination (for instance, a user of a purely cosmetic hand can distinguish a contact on the index finger from other fingers with 83% accuracy), but all tested hands, including soft advanced ones, performed above chance level. Despite advanced hands reducing vibration transmission, a machine learning algorithm still exceeded human performance in discriminating finger contacts. These findings suggest the potential for enhancing vibrotactile feedback in advanced prosthetic hands and lay the groundwork for future integration of such feedback in prosthetic devices.
Predicting Ground Reaction Force from Inertial Sensors
Song, Bowen, Paolieri, Marco, Stewart, Harper E., Golubchik, Leana, McNitt-Gray, Jill L., Misra, Vishal, Shah, Devavrat
The study of ground reaction forces (GRF) is used to characterize the mechanical loading experienced by individuals in movements such as running, which is clinically applicable to identify athletes at risk for stress-related injuries. Our aim in this paper is to determine if data collected with inertial measurement units (IMUs), which can be worn by athletes during outdoor runs, can be used to predict GRF with sufficient accuracy to allow the analysis of its derived biomechanical variables (e.g., contact time and loading rate). In this paper, we consider lightweight approaches in contrast to state-of-the-art prediction using LSTM neural networks. Specifically, we compare the use of LSTMs to k-Nearest Neighbors (KNN) regression and propose a novel solution, SVD Embedding Regression (SER), using linear regression between singular value decomposition embeddings of IMU data (input) and GRF data (output). We evaluate the accuracy of these techniques when using training data collected from different athletes, from the same athlete, or both, and we explore the use of acceleration and angular velocity data from sensors at different locations (sacrum and shanks). Our results illustrate that simple machine learning methods such as SER and KNN can be similarly accurate or more accurate than LSTM neural networks, with much faster training and hyperparameter optimization; in particular, SER and KNN are more accurate when personal training data are available, and KNN comes with the benefit of providing provenance of prediction. Notably, the use of personal data reduces prediction errors of all methods for most biomechanical variables.
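The description of SER above (linear regression between SVD embeddings of input and output) can be sketched roughly as follows. The class name, the mean-centering, the use of right singular vectors as the embedding basis, and the parameters `k_in` and `k_out` are all assumptions; the abstract does not state how the embeddings are constructed.

```python
import numpy as np

class SVDEmbeddingRegression:
    """Hypothetical sketch: regress low-rank output embeddings on low-rank input embeddings."""

    def __init__(self, k_in, k_out):
        self.k_in = k_in      # input embedding dimension (assumed hyperparameter)
        self.k_out = k_out    # output embedding dimension (assumed hyperparameter)

    def fit(self, X, Y):
        X = np.asarray(X, dtype=float)
        Y = np.asarray(Y, dtype=float)
        self.x_mean = X.mean(axis=0)
        self.y_mean = Y.mean(axis=0)
        # Rows of Vt are right singular vectors; keep the leading ones as a basis.
        _, _, vt_x = np.linalg.svd(X - self.x_mean, full_matrices=False)
        _, _, vt_y = np.linalg.svd(Y - self.y_mean, full_matrices=False)
        self.Vx = vt_x[: self.k_in].T
        self.Vy = vt_y[: self.k_out].T
        zx = (X - self.x_mean) @ self.Vx
        zy = (Y - self.y_mean) @ self.Vy
        # Ordinary least squares between the two embeddings.
        self.W, *_ = np.linalg.lstsq(zx, zy, rcond=None)
        return self

    def predict(self, X):
        zx = (np.asarray(X, dtype=float) - self.x_mean) @ self.Vx
        return zx @ self.W @ self.Vy.T + self.y_mean
```

Fitting needs only two SVDs and one least-squares solve, which is consistent with the short training times the abstract reports for lightweight methods.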
Concurrent Haptic, Audio, and Visual Data Set During Bare Finger Interaction with Textured Surfaces
Devillard, Alexis W. M., Ramasamy, Aruna, Faux, Damien, Hayward, Vincent, Burdet, Etienne
Abstract: Perceptual processes are frequently multi-modal. This is the case of haptic perception. [...] Such data set would be useful to conduct the [...] This observation motivated us to create a multi-modal data set comprising the signals created when a bare finger explored varied textured surfaces. The measured signals were the stereoscopic images of the surface, the position and speed of the fingertip in image coordinates, the load applied by the [...] Introduction: It is well known that human perception is often multisensory, where different sources of information accessed through different sensory modalities are merged and integrated by the brain. This integration process is thought to increase the robustness of the perception of the properties of objects in the face of uncertainty, to resolve ambiguities, and to contribute to the perceptual stability of sensory scenes [1]-[4].
Development and Evaluation of a Learning-based Model for Real-time Haptic Texture Rendering
Heravi, Negin, Culbertson, Heather, Okamura, Allison M., Bohg, Jeannette
Current Virtual Reality (VR) environments lack the rich haptic signals that humans experience during real-life interactions, such as the sensation of texture during lateral movement on a surface. Adding realistic haptic textures to VR environments requires a model that generalizes to variations of a user's interaction and to the wide variety of existing textures in the world. Current methodologies for haptic texture rendering exist, but they usually develop one model per texture, resulting in low scalability. We present a deep learning-based action-conditional model for haptic texture rendering and evaluate its perceptual performance in rendering realistic texture vibrations through a multi part human user study. This model is unified over all materials and uses data from a vision-based tactile sensor (GelSight) to render the appropriate surface conditioned on the user's action in real time. For rendering texture, we use a high-bandwidth vibrotactile transducer attached to a 3D Systems Touch device. The result of our user study shows that our learning-based method creates high-frequency texture renderings with comparable or better quality than state-of-the-art methods without the need for learning a separate model per texture. Furthermore, we show that the method is capable of rendering previously unseen textures using a single GelSight image of their surface.
FG-SSA: Features Gradient-based Signals Selection Algorithm of Linear Complexity for Convolutional Neural Networks
Omae, Yuto, Sakai, Yusuke, Takahashi, Hirotaka
Recently, many convolutional neural networks (CNNs) have been developed for classifying multi-signal time-domain data. Although some signals are important for correct classification, others are not. When signals that are not important for classification are fed to the CNN input layer, the calculation, memory, and data collection costs increase. Therefore, identifying and eliminating nonimportant signals from the input layer is important. In this study, we proposed the features gradient-based signals selection algorithm (FG-SSA), which finds and removes nonimportant signals for classification by utilizing the features gradient obtained in the calculation process of grad-CAM. When we define N as the number of signals, the computational complexity of the proposed algorithm is linear, O(N); that is, it has a low calculation cost. We verified the effectiveness of the algorithm using the OPPORTUNITY Activity Recognition dataset, which is an open dataset comprising acceleration signals of human activities. In addition, we confirmed that FG-SSA removed an average of 6.55 of the 15 acceleration signals (five triaxial sensors) while maintaining high generalization scores of classification. Therefore, the proposed algorithm FG-SSA is effective at finding and removing signals that are not important for CNN-based classification.
Accelerometer-based Bed Occupancy Detection for Automatic, Non-invasive Long-term Cough Monitoring
Pahar, Madhurananda, Miranda, Igor, Diacon, Andreas, Niesler, Thomas
We present a new machine learning based bed-occupancy detection system that uses the accelerometer signal captured by a bed-attached consumer smartphone. Automatic bed-occupancy detection is necessary for automatic long-term cough monitoring, since the time which the monitored patient occupies the bed is required to accurately calculate a cough rate. Accelerometer measurements are more cost effective and less intrusive than alternatives such as video monitoring or pressure sensors. A 249-hour dataset of manually-labelled acceleration signals gathered from seven patients undergoing treatment for tuberculosis (TB) was compiled for experimentation. These signals are characterised by brief activity bursts interspersed with long periods of little or no activity, even when the bed is occupied. To process them effectively, we propose an architecture consisting of three interconnected components. An occupancy-change detector locates instances at which bed occupancy is likely to have changed, an occupancy-interval detector classifies periods between detected occupancy changes and an occupancy-state detector corrects falsely-identified occupancy changes. Using long short-term memory (LSTM) networks, this architecture was demonstrated to achieve an AUC of 0.94. When integrated into a complete cough monitoring system, the daily cough rate of a patient undergoing TB treatment was determined over a period of 14 days. As the colony forming unit (CFU) counts decreased and the time to positivity (TPP) increased, the measured cough rate decreased, indicating effective TB treatment. This provides a first indication that automatic cough monitoring based on bed-mounted accelerometer measurements may present a non-invasive, non-intrusive and cost-effective means of monitoring long-term recovery of TB patients.
Gait Event Detection in Tibial Acceleration Profiles: a Structured Learning Approach
Robberechts, Pieter, Derie, Rud, Berghe, Pieter Van den, Gerlo, Joeri, De Clercq, Dirk, Segers, Veerle, Davis, Jesse
Analysis of runner's data will often examine gait variables with reference to one or more gait events. Two such representative events are the initial contact and toe off events. These correspond respectively to the moments in time when the foot makes the initial contact with the ground and when the foot leaves the ground again. These variables are traditionally measured with a force plate or motion capture system in a lab setting. However, thanks to recent evolutions in wearable technology, the use of accelerometers has become commonplace for prolonged outdoor measurements. Previous research has developed heuristic methods to identify the initial contact and toe off timings based on minima, maxima and thresholds in the acceleration profiles. A significant flaw of these heuristic-based methods is that they are tailored to very specific acceleration profiles, providing no guidelines on how to handle deviant profiles. Therefore, we frame the problem as a structured prediction task and propose a machine learning approach for determining initial foot contact and toe off events from 3D tibial acceleration profiles. With mean absolute errors of 2 ms and 4 ms for respectively the initial contact and toe-off events, our method significantly outperforms the existing heuristic approaches.
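For context, the kind of threshold-and-maxima heuristic that the abstract above contrasts with its learned approach can be sketched as below. This is a generic peak picker with a refractory gap, not the paper's structured prediction method and not any specific published heuristic; the function name, the threshold, and the `min_gap_s` parameter are assumptions.

```python
import numpy as np

def detect_peaks_above_threshold(signal, fs, threshold, min_gap_s=0.25):
    """Return indices of local maxima above threshold, at least min_gap_s seconds apart."""
    x = np.asarray(signal, dtype=float)
    # Interior samples that rise from the left, do not rise to the right, and exceed the threshold.
    is_peak = (x[1:-1] > x[:-2]) & (x[1:-1] >= x[2:]) & (x[1:-1] > threshold)
    candidates = np.flatnonzero(is_peak) + 1
    min_gap = int(round(min_gap_s * fs))
    kept = []
    for i in candidates:
        if not kept or i - kept[-1] >= min_gap:
            kept.append(int(i))
        elif x[i] > x[kept[-1]]:
            kept[-1] = int(i)    # within the gap, keep the larger of the two peaks
    return np.array(kept, dtype=int)
```

The fixed threshold and gap are exactly the hand-tuned choices that break down on atypical acceleration profiles, which is the limitation the abstract highlights.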